Chapter 11
Randomness and Complexity
Randomness is a concept deeply entangled with bioinformatics. A random sequence
cannot convey information, in the sense that it could be generated by a recipient
merely by tossing a coin. Randomness is therefore a kind of “null hypothesis”; a
random sequence of symbols is a sequence lacking all constraints limiting the variety
of choice of successive symbols selected from a pool with constant composition (i.e.,
an ergodic source). Such a sequence has maximum entropy in the Shannon sense;
that is, it has minimum redundancy.
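The link between maximum entropy and minimum redundancy can be made explicit with the standard Shannon definitions. The notation below (n symbols with probabilities p_i, entropy H, redundancy R) is assumed here for illustration and may differ from that used elsewhere in the book; it treats successive symbols as independent.

```latex
% Assumed notation: a source emitting n symbols independently with probabilities p_i.
H = -\sum_{i=1}^{n} p_i \log_2 p_i , \qquad
H_{\max} = \log_2 n \quad \text{(attained when } p_i = 1/n \text{ for all } i\text{)}, \qquad
R = 1 - \frac{H}{H_{\max}} .
% Fair coin: n = 2 and p_1 = p_2 = 1/2, so H = 1 bit per symbol and R = 0.
```

For a fair coin the entropy therefore equals its maximum of one bit per toss, and the redundancy vanishes.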
If we are using such an ideally random sequence as a starting point for assessing
departures from randomness, it is important to be able to recognize this ideal
randomness. How easy is this task? Consider the following three sequences:
1111111111111111111111111111111111
0101010101010101010101010101010101
1001010001010010101011110100101010
each of which could have been generated by tossing a coin. According to the results
from Chaps. 8 and 9, all three outcomes, indeed any sequence of 32 1s and 0s, have
equal probability of occurrence, namely $1/2^{32}$. Why do the first two not “look”
random? Kolmogorov supposed that the answer might belong to psychology; Borel
even asserted that the human mind is unable to simulate randomness (presumably the
ability to recognize patterns was—and is—important for our survival). Yet, apparent
pattern is also present in random sequences: van der Waerden has proved that in every
infinite binary sequence at least one of the two symbols must occur in arithmetical
progressions of every length. Hence, the first of the above three sequences would be
an unexceptionable occurrence in a much longer random sequence—in fact, whether
a given sequence is random is formally undecidable. At best, then, we can hope for
heuristic clues to the possible absence of randomness and, hence, presumably the
presence of meaning, in a gene sequence.
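One simple heuristic clue of this kind can be sketched in Python. The sketch below first confirms that, under a fair coin, the probability of a specific sequence depends only on its length, and then computes the empirical Shannon entropy of overlapping blocks of length k. The function names and the choice of block entropy as the heuristic are assumptions made for illustration, not a method prescribed by the text, and a low value only hints at regularity; it cannot prove or disprove randomness.

```python
import math
from collections import Counter

# The three example sequences, copied as printed above.
SEQUENCES = [
    "1111111111111111111111111111111111",
    "0101010101010101010101010101010101",
    "1001010001010010101011110100101010",
]


def fair_coin_probability(seq: str) -> float:
    """Probability of one specific outcome of len(seq) fair coin tosses."""
    return 0.5 ** len(seq)


def block_entropy(seq: str, k: int) -> float:
    """Empirical Shannon entropy (bits) of the overlapping length-k blocks of seq."""
    blocks = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(blocks)
    total = len(blocks)
    # Sum of p * log2(1/p) over the observed block frequencies p = c / total.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


if __name__ == "__main__":
    for seq in SEQUENCES:
        print(seq)
        print("  probability under a fair coin:", fair_coin_probability(seq))
        for k in (1, 2, 3):
            print(f"  entropy of blocks of length {k}: {block_entropy(seq, k):.3f} bits")
```

Single-symbol frequencies (k = 1) cannot distinguish the alternating sequence from the irregular one, since both contain roughly equal numbers of 0s and 1s; the regularity only shows up for blocks of length two or more, where the alternating sequence uses just two of the four possible blocks.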
© Springer Nature Switzerland AG 2023
J. Ramsden, Bioinformatics, Computational Biology,
https://doi.org/10.1007/978-3-030-45607-8_11